Integration of Data from a Syntactic Lexicon into Generative and Discriminative Probabilistic Parsers
Abstract
This article evaluates the integration of data extracted from a syntactic lexicon, namely the Lexicon-Grammar, into several probabilistic parsers for French. We show that by modifying the Part-of-Speech tags of verbs and verbal nouns of a treebank, we obtain accurate performances with a parser based on Probabilistic Context-Free Grammars (Petrov et al., 2006) and a discriminative parser based on a reranking algorithm (Charniak and Johnson, 2005).
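As a rough illustration of what "modifying the Part-of-Speech tags of verbs" could look like in practice, the sketch below appends a lexicon class label to the tag of each known verb in a bracketed tree. The abstract does not describe the actual procedure, so everything here is an assumption: the class labels (`C1`, `C2`), the verb-to-class mapping, the `V` tag, the `TAG-CLASS` format, and the helper `refine_verb_tags` are all hypothetical.

```python
import re

# Hypothetical verb -> lexicon class mapping; the class names are invented
# for illustration and are not taken from the Lexicon-Grammar tables.
VERB_CLASSES = {"mange": "C1", "donne": "C2"}

# Matches a preterminal node such as "(V mange)": a tag followed by one word.
PRETERMINAL = re.compile(r"\(([^\s()]+) ([^\s()]+)\)")


def refine_verb_tags(tree, classes, verb_tag="V"):
    """Append the lexicon class to the tag of each verb found in `classes`."""
    def repl(match):
        tag, word = match.group(1), match.group(2)
        if tag == verb_tag and word.lower() in classes:
            tag = f"{tag}-{classes[word.lower()]}"
        return f"({tag} {word})"
    return PRETERMINAL.sub(repl, tree)


tree = "(SENT (NP (D le) (N chat)) (VN (V mange)) (NP (D la) (N souris)))"
refined = refine_verb_tags(tree, VERB_CLASSES)
print(refined)
```

Verbs missing from the mapping keep their original tag, so a treebank rewritten this way stays usable by a parser even when lexicon coverage is partial.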
Similar resources
A Single Generative Model for Joint Morphological Segmentation and Syntactic Parsing
Morphological processes in Semitic languages deliver space-delimited words which introduce multiple, distinct, syntactic units into the structure of the input sentence. These words are in turn highly ambiguous, breaking the assumption underlying most parsers that the yield of a tree for a given sentence is known in advance. Here we propose a single joint model for performing both morphological ...
Generalizing a Strongly Lexicalized Parser using Unlabeled Data
Statistical parsers trained on labeled data suffer from sparsity, both grammatical and lexical. For parsers based on strongly lexicalized grammar formalisms (such as CCG, which has complex lexical categories but simple combinatory rules), the problem of sparsity can be isolated to the lexicon. In this paper, we show that semi-supervised Viterbi-EM can be used to extend the lexicon of a generati...
Studying impressive parameters on the performance of Persian probabilistic context free grammar parser
In linguistics, a tree bank is a parsed text corpus that annotates syntactic or semantic sentence structure. The exploitation of tree bank data has been important ever since the first large-scale tree bank, the Penn Treebank, was published. However, although originating in computational linguistics, the value of tree banks is becoming more widely appreciated in linguistics research as a whole. F...
French parsing enhanced with a word clustering method based on a syntactic lexicon
This article evaluates the integration of data extracted from a French syntactic lexicon, the Lexicon-Grammar (Gross, 1994), into a probabilistic parser. We show that by applying clustering methods on verbs of the French Treebank (Abeillé et al., 2003), we obtain accurate performances on French with a parser based on a Probabilistic Context-Free Grammar (Petrov et al., 2006).
Intégration de ressources lexicales riches dans un analyseur syntaxique probabiliste. (Integration of lexical resources in a probabilistic parser)
This thesis focuses on the integration of lexical and syntactic resources of French in two fundamental tasks of Natural Language Processing (NLP): probabilistic part-of-speech tagging and probabilistic parsing. In the case of French, there are a lot of lexical and syntactic data created by automatic processes or by linguists. In addition, a number of experiments have shown interest to ...